
    Gradient-free activation maximization for identifying effective stimuli

    A fundamental question for understanding brain function is what types of stimuli drive neurons to fire. In visual neuroscience, this question has also been posed as characterizing the receptive field of a neuron. The search for effective stimuli has traditionally been based on a combination of insights from previous studies, intuition, and luck. Recently, the same question has emerged in the study of units in convolutional neural networks (ConvNets), and with it a family of solutions was developed that is generally referred to as "feature visualization by activation maximization." We sought to bring tools and techniques developed for studying ConvNets to the study of biological neural networks. However, one key difference that impedes direct translation of tools is that gradients can be obtained from ConvNets using backpropagation, but no such gradients are available from the brain. To circumvent this problem, we developed a method for gradient-free activation maximization that combines a generative neural network with a genetic algorithm. We termed this method XDream (EXtending DeepDream with real-time evolution for activation maximization) and have shown that it can reliably create strong stimuli for neurons in the macaque visual cortex (Ponce et al., 2019). In this paper, we describe extensive experiments characterizing the XDream method using ConvNet units as in silico models of neurons. We show that XDream is applicable across network layers, architectures, and training sets; examine design choices in the algorithm; and provide practical guidance for choosing its hyperparameters. XDream is an efficient algorithm for uncovering neuronal tuning preferences in black-box networks using a vast and diverse stimulus space. Comment: 16 pages, 8 figures, 3 tables
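    The core idea of the abstract, optimizing a generator's latent codes with a genetic algorithm against a black-box scorer that provides no gradients, can be sketched as follows. This is a minimal sketch, not the published XDream implementation: the generator and the "neuron" below are toy stand-ins, and the population size, mutation scale, and selection scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(code):
    """Toy generator mapping a latent code to a 'stimulus'.
    (Stand-in for the deep generative network in the actual method.)"""
    return np.tanh(code)

def neuron_response(stimulus):
    """Black-box scorer with no gradients available
    (stand-in for a recorded neuron; here it prefers a fixed ramp)."""
    target = np.linspace(-1.0, 1.0, stimulus.size)
    return -np.sum((stimulus - target) ** 2)

def xdream_like_search(dim=16, pop_size=20, n_gen=200, sigma=0.3):
    """Gradient-free activation maximization via a genetic algorithm."""
    pop = rng.normal(size=(pop_size, dim))          # initial latent codes
    for _ in range(n_gen):
        scores = np.array([neuron_response(generator(c)) for c in pop])
        # Selection: keep the top half of the population as parents.
        parents = pop[np.argsort(scores)[-pop_size // 2:]]
        # Uniform crossover between random parent pairs, then mutation.
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, dim)) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        pop = children + rng.normal(scale=sigma, size=children.shape)
    scores = np.array([neuron_response(generator(c)) for c in pop])
    return pop[np.argmax(scores)], scores.max()

best_code, best_score = xdream_like_search()
```

    The same loop applies unchanged when the scorer is a real neuron's firing rate, which is the point of the method: only forward evaluations of candidate stimuli are required.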

    Unsupervised Learning of Visual Structure using Predictive Generative Networks

    The ability to predict future states of the environment is a central pillar of intelligence. At its core, effective prediction requires an internal model of the world and an understanding of the rules by which the world changes. Here, we explore the internal models developed by deep neural networks trained with a loss based on predicting future frames in synthetic video sequences, using a CNN-LSTM-deCNN framework. We first show that this architecture can achieve excellent performance on visual sequence prediction tasks, including state-of-the-art performance on a standard 'bouncing balls' dataset (Sutskever et al., 2009). Using a weighted mean-squared error and an adversarial loss (Goodfellow et al., 2014), the same architecture successfully extrapolates out-of-the-plane rotations of computer-generated faces. Furthermore, despite being trained end-to-end to predict only pixel-level information, our Predictive Generative Networks learn a representation of the latent structure of the underlying three-dimensional objects themselves. Importantly, we find that this representation is naturally tolerant to object transformations and generalizes well to new tasks, such as classification of static images. Similar models trained solely with a reconstruction loss fail to generalize as effectively. We argue that prediction can serve as a powerful unsupervised loss for learning rich internal representations of high-level object features. Comment: under review as a conference paper at ICLR 201
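    The training objective named in the abstract, a weighted mean-squared error combined with an adversarial loss, might be assembled roughly as below. The per-pixel weights, the mixing coefficient `lam`, and the non-saturating generator-loss form are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def weighted_mse(pred, target, weights):
    """Per-pixel weighted mean-squared error between prediction and target."""
    return np.mean(weights * (pred - target) ** 2)

def adversarial_gen_loss(d_on_fake):
    """Non-saturating generator objective: -log D(G(z)), averaged over the batch.
    d_on_fake holds the discriminator's probabilities on generated frames."""
    eps = 1e-8  # numerical guard against log(0)
    return -np.mean(np.log(d_on_fake + eps))

def combined_loss(pred, target, d_on_fake, weights=None, lam=0.1):
    """Weighted reconstruction term plus an adversarial term scaled by lam."""
    if weights is None:
        weights = np.ones_like(pred)
    return weighted_mse(pred, target, weights) + lam * adversarial_gen_loss(d_on_fake)
```

    In practice the reconstruction term keeps predictions pixel-accurate while the adversarial term discourages the blurry averages that a pure MSE loss tends to produce.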

    Theory on the Coupled Stochastic Dynamics of Transcription and Splice-Site Recognition

    Eukaryotic genes are typically split into exons that need to be spliced together to form the mature mRNA. The splicing process depends on the dynamics of, and interactions between, transcription by the RNA polymerase II complex (RNAPII) and the spliceosomal complex, which consists of multiple small nuclear ribonucleoproteins (snRNPs). Here we propose a biophysically plausible initial theory of splicing that aims to explain the effects of the stochastic dynamics of snRNPs on the splicing patterns of eukaryotic genes. We consider two different ways to model the dynamics of snRNPs: pure three-dimensional diffusion, and a combination of three- and one-dimensional diffusion along the emerging pre-mRNA. Our theoretical analysis shows that there exists an optimum position of the splice sites on the growing pre-mRNA at which the time required for snRNPs to find the 5′ donor site is minimized. The minimization of the overall search time is achieved mainly via the increase in non-specific interactions between the snRNPs and the growing pre-mRNA. The theory further predicts that there exists an optimum transcript length that maximizes the probability for exons to interact with the snRNPs. We evaluate these theoretical predictions using human and mouse exon microarray data as well as RNA-seq data from multiple tissues. We observe that there is a broad optimum position of splice sites on the growing pre-mRNA and an optimum transcript length, both roughly consistent with the theoretical predictions. The theoretical and experimental analyses suggest that there is a strong interaction between the dynamics of RNAPII and the stochastic nature of the snRNP search for 5′ donor splice sites.
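    The combined 3D + 1D search mode can be illustrated with a toy Monte Carlo model: a searcher repeatedly lands at a random position on the transcript (a 3D hop) and then slides a short distance in 1D before hopping again. All lengths and rates here are arbitrary stand-ins, not the paper's parameters; the sketch only shows that adding 1D sliding, i.e. non-specific interactions with the pre-mRNA, shortens the search.

```python
import numpy as np

rng = np.random.default_rng(1)

def search_time(target, length, slide, n_trials=1000):
    """Toy facilitated-diffusion search on a transcript of `length` sites.
    Each round: land at a uniform random position (3D hop), then slide up
    to `slide` sites in a random direction (1D scan). The search ends when
    the scanned window covers the target site. Returns mean rounds needed."""
    times = np.empty(n_trials)
    for i in range(n_trials):
        rounds = 0
        while True:
            rounds += 1
            pos = rng.integers(0, length)          # 3D hop: random landing site
            end = pos + rng.choice([-1, 1]) * slide  # 1D slide in one direction
            if min(pos, end) <= target <= max(pos, end):
                break
        times[i] = rounds
    return times.mean()
```

    With `slide=0` the model reduces to pure 3D hopping, which takes markedly longer on average than the combined mode, mirroring the theory's conclusion that non-specific snRNP-transcript interactions drive the speed-up.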

    Finding any Waldo: zero-shot invariant and efficient visual search

    Visual search constitutes a ubiquitous challenge in natural vision, including daily tasks such as finding a friend in a crowd or searching for a car in a parking lot. Visual search must fulfill four key properties: selectivity (to distinguish the target from distractors in a cluttered scene), invariance (to localize the target despite changes in its rotation, scale, and illumination, and even when searching for generic object categories), speed (to efficiently localize the target without exhaustive sampling), and generalization (to search for any object, even ones with which we have had minimal or no experience). Here we propose a computational model, directly inspired by neurophysiological recordings during visual search in macaque monkeys, that maps the discriminative power of object recognition models onto the problem of visual search. The model takes two inputs, a target object and a search image, and produces a sequence of fixations. It consists of a deep convolutional network that extracts features of the target object, stores those features, and uses them in a top-down fashion to modulate the responses to the search image, thus generating a task-dependent saliency map. We show that the model fulfills the critical properties outlined above, distinguishing it from heuristic approaches such as template matching, random search, sliding windows, bottom-up saliency maps, and object detection algorithms. Furthermore, we directly compare the model against human eye-movement behavior during three increasingly complex tasks in which subjects search for a target object in a multi-object array image, in natural scenes, or in the well-known Waldo search task. We show that the model provides a reasonable first-order approximation to human behavior and can efficiently find targets in an invariant manner, without any training on the target objects.
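    The top-down step described in the abstract, target features modulating the responses to the search image to yield a task-dependent saliency map that is read out as a fixation sequence, could be sketched as below. The feature shapes, the cosine-similarity form of the modulation, and the winner-take-all readout with inhibition of return are simplifying assumptions, not the paper's exact architecture.

```python
import numpy as np

def target_modulated_saliency(search_feats, target_feat):
    """Top-down modulation: cosine similarity between the target's feature
    vector and the feature vector at every search-image location.
    search_feats: (H, W, C) feature map; target_feat: (C,) vector."""
    t = target_feat / (np.linalg.norm(target_feat) + 1e-8)
    f = search_feats / (np.linalg.norm(search_feats, axis=-1, keepdims=True) + 1e-8)
    return f @ t  # (H, W) task-dependent saliency map

def fixation_sequence(saliency, n_fix=5, ior_radius=1):
    """Read out fixations by winner-take-all with inhibition of return."""
    s = saliency.copy()
    fixations = []
    for _ in range(n_fix):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        fixations.append((y, x))
        # Inhibition of return: suppress a neighborhood of the fixated spot
        # so the next fixation explores elsewhere.
        s[max(0, y - ior_radius):y + ior_radius + 1,
          max(0, x - ior_radius):x + ior_radius + 1] = -np.inf
    return fixations
```

    Because the saliency map is recomputed from the target's features, the same machinery searches for any target without retraining, which is the zero-shot property the title refers to.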

    Depression-Biased Reverse Plasticity Rule Is Required for Stable Learning at Top-down Connections

    Top-down synapses are ubiquitous throughout neocortex and play a central role in cognition, yet little is known about their development and specificity. During sensory experience, lower neocortical areas are activated before higher ones, causing top-down synapses to experience a preponderance of post-synaptic activity preceding pre-synaptic activity. This timing pattern is the opposite of that experienced by bottom-up synapses, which suggests that different versions of spike-timing-dependent synaptic plasticity (STDP) rules may be required at top-down synapses. We consider a two-layer neural network model and investigate which STDP rules can lead to a distribution of top-down synaptic weights that is stable, diverse and avoids strong loops. We introduce a temporally reversed rule (rSTDP) where top-down synapses are potentiated if post-synaptic activity precedes pre-synaptic activity. Combining analytical work and integrate-and-fire simulations, we show that only depression-biased rSTDP (and not classical STDP) produces stable and diverse top-down weights. The conclusions did not change upon addition of homeostatic mechanisms, multiplicative STDP rules or weak external input to the top neurons. Our prediction for rSTDP at top-down synapses, which are distally located, is supported by recent neurophysiological evidence showing the existence of temporally reversed STDP in synapses that are distal to the post-synaptic cell body.
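    The temporally reversed, depression-biased rule can be written down compactly. The exponential time window and the specific amplitudes below are illustrative assumptions; only the sign structure (post-before-pre potentiates, pre-before-post depresses) and the depression bias (the depression amplitude slightly exceeding the potentiation amplitude) come from the abstract.

```python
import numpy as np

def rstdp_update(dt, a_plus=0.020, a_minus=0.021, tau=20.0):
    """Weight change under temporally reversed STDP (rSTDP).
    dt = t_post - t_pre in ms. Reversed relative to classical STDP:
    post-before-pre (dt < 0) potentiates, pre-before-post (dt >= 0)
    depresses. a_minus > a_plus encodes the depression bias that the
    model requires for stable top-down weights."""
    if dt < 0:
        return a_plus * np.exp(dt / tau)    # potentiation, decays with |dt|
    return -a_minus * np.exp(-dt / tau)     # depression, decays with dt
```

    Plotting this function over dt gives the mirror image of the classical STDP window, with the depression lobe slightly dominant.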

    A role for recurrent processing in object completion: neurophysiological, psychophysical and computational evidence

    Recognition of objects from partial information presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. We combined neurophysiological recordings in human cortex with psychophysical measurements and computational modeling to investigate the mechanisms involved in object completion. We recorded intracranial field potentials from 1,699 electrodes in 18 epilepsy patients to measure the timing and selectivity of responses along human visual cortex to whole and partial objects. Responses along the ventral visual stream remained selective even when only 9-25% of the object was shown. However, these visually selective signals emerged ~100 ms later for partial than for whole objects. The processing delays were particularly pronounced in higher visual areas within the ventral stream, suggesting the involvement of additional recurrent processing. In separate psychophysics experiments, disrupting this recurrent computation with a backward mask at ~75 ms significantly impaired recognition of partial, but not whole, objects. Additionally, computational modeling shows that the performance of a purely bottom-up architecture is impaired by heavy occlusion and that this effect can be partially rescued by incorporating top-down connections. These results provide spatiotemporal constraints on theories of object recognition that involve recurrent processing to recognize objects from partial information.
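    As a minimal, generic illustration of how recurrent dynamics can complete an occluded pattern (a classic Hopfield-style network, chosen here as a stand-in and not the paper's model), the sketch below iterates recurrent updates while clamping the visible entries, so the occluded entries are filled in from the stored pattern.

```python
import numpy as np

def hopfield_complete(patterns, partial, mask, n_iter=20):
    """Complete an occluded +/-1 pattern via recurrent dynamics.
    patterns: list of stored +/-1 vectors; partial: input with occluded
    entries arbitrary; mask: boolean, True where the input is visible."""
    P = np.array(patterns, dtype=float)
    W = P.T @ P / P.shape[1]          # Hebbian recurrent weights
    np.fill_diagonal(W, 0.0)          # no self-connections
    s = np.where(mask, partial, 0.0).astype(float)  # occluded units start at 0
    for _ in range(n_iter):
        s = np.sign(W @ s)            # recurrent update
        s[s == 0] = 1.0               # break ties deterministically
        s = np.where(mask, partial, s)  # clamp the visible (bottom-up) entries
    return s
```

    The visible entries act like the bottom-up evidence from the unoccluded object parts, while the recurrent iterations play the role that the abstract assigns to top-down connections: recovering the missing portion from stored structure.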